-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[lld-macho] Include branch extension thunks in linker map #120496
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This patch extends the MachO linker's map file generation to include branch extension thunk symbols. Previously, thunks were omitted from the map file, making it difficult to understand the final layout of the binary, especially when debugging issues related to long branch thunks. This change ensures thunks are included and correctly interleaved with other symbols based on their address, providing an accurate representation of the linked output.
|
@llvm/pr-subscribers-lld-macho @llvm/pr-subscribers-lld Author: None (alx32) ChangesThis patch extends the MachO linker's map file generation to include branch extension thunk symbols. Previously, thunks were omitted from the map file, making it difficult to understand the final layout of the binary, especially when debugging issues related to long branch thunks. This change ensures thunks are included and correctly interleaved with other symbols based on their address, providing an accurate representation of the linked output. Full diff: https://github.com/llvm/llvm-project/pull/120496.diff 4 Files Affected:
diff --git a/lld/MachO/ConcatOutputSection.h b/lld/MachO/ConcatOutputSection.h
index 9af661d0ab1e0c..c2449e82b73ad1 100644
--- a/lld/MachO/ConcatOutputSection.h
+++ b/lld/MachO/ConcatOutputSection.h
@@ -25,8 +25,9 @@ class Defined;
// in the final binary.
class ConcatOutputSection : public OutputSection {
public:
- explicit ConcatOutputSection(StringRef name)
- : OutputSection(ConcatKind, name) {}
+ explicit ConcatOutputSection(StringRef name,
+ OutputSection::Kind kind = ConcatKind)
+ : OutputSection(kind, name) {}
const ConcatInputSection *firstSection() const { return inputs.front(); }
const ConcatInputSection *lastSection() const { return inputs.back(); }
@@ -46,7 +47,7 @@ class ConcatOutputSection : public OutputSection {
void writeTo(uint8_t *buf) const override;
static bool classof(const OutputSection *sec) {
- return sec->kind() == ConcatKind;
+ return sec->kind() == ConcatKind || sec->kind() == TextKind;
}
static ConcatOutputSection *getOrCreateForInput(const InputSection *);
@@ -66,12 +67,18 @@ class ConcatOutputSection : public OutputSection {
// support thunk insertion.
class TextOutputSection : public ConcatOutputSection {
public:
- explicit TextOutputSection(StringRef name) : ConcatOutputSection(name) {}
+ explicit TextOutputSection(StringRef name)
+ : ConcatOutputSection(name, TextKind) {}
void finalizeContents() override {}
void finalize() override;
bool needsThunks() const;
+ const std::vector<ConcatInputSection *> &getThunks() const { return thunks; }
void writeTo(uint8_t *buf) const override;
+ static bool classof(const OutputSection *sec) {
+ return sec->kind() == TextKind;
+ }
+
private:
uint64_t estimateStubsInRangeVA(size_t callIdx) const;
diff --git a/lld/MachO/MapFile.cpp b/lld/MachO/MapFile.cpp
index 9c0621622ae2f0..6fdea64cbefa04 100644
--- a/lld/MachO/MapFile.cpp
+++ b/lld/MachO/MapFile.cpp
@@ -161,6 +161,28 @@ static uint64_t getSymSizeForMap(Defined *sym) {
return sym->size;
}
+// Merges two vectors of input sections in order of their outSecOff values.
+// This approach creates a new (temporary) vector which is not ideal but the
+// ideal approach leads to a lot of code duplication.
+static std::vector<ConcatInputSection *>
+mergeOrderedInputs(const std::vector<ConcatInputSection *> &inputs1,
+ const std::vector<ConcatInputSection *> &inputs2) {
+ std::vector<ConcatInputSection *> vec;
+ size_t i = 0, ie = inputs1.size();
+ size_t t = 0, te = inputs2.size();
+ while (i < ie || t < te) {
+ while (i < ie &&
+ (t == te || inputs1[i]->outSecOff <= inputs2[t]->outSecOff)) {
+ vec.push_back(inputs1[i++]);
+ }
+ while (t < te &&
+ (i == ie || inputs2[t]->outSecOff < inputs1[i]->outSecOff)) {
+ vec.push_back(inputs2[t++]);
+ }
+ }
+ return vec;
+}
+
void macho::writeMapFile() {
if (config->mapFile.empty())
return;
@@ -220,7 +242,12 @@ void macho::writeMapFile() {
os << "# Address\tSize \tFile Name\n";
for (const OutputSegment *seg : outputSegments) {
for (const OutputSection *osec : seg->getSections()) {
- if (auto *concatOsec = dyn_cast<ConcatOutputSection>(osec)) {
+ const TextOutputSection *textOsec = dyn_cast<TextOutputSection>(osec);
+ if (textOsec && textOsec->getThunks().size()) {
+ auto inputsAndThunks =
+ mergeOrderedInputs(textOsec->inputs, textOsec->getThunks());
+ printIsecArrSyms(inputsAndThunks);
+ } else if (auto *concatOsec = dyn_cast<ConcatOutputSection>(osec)) {
printIsecArrSyms(concatOsec->inputs);
} else if (osec == in.cStringSection || osec == in.objcMethnameSection) {
const auto &liveCStrings = info.liveCStringsForSection.lookup(osec);
diff --git a/lld/MachO/OutputSection.h b/lld/MachO/OutputSection.h
index 5297a03c2cfa7f..9afd3a9eeb1928 100644
--- a/lld/MachO/OutputSection.h
+++ b/lld/MachO/OutputSection.h
@@ -37,6 +37,7 @@ class OutputSection {
enum Kind {
ConcatKind,
SyntheticKind,
+ TextKind,
};
OutputSection(Kind kind, StringRef name) : name(name), sectionKind(kind) {}
diff --git a/lld/test/MachO/arm64-thunks.s b/lld/test/MachO/arm64-thunks.s
index d887359bbc23e1..8d7bb154379bc4 100644
--- a/lld/test/MachO/arm64-thunks.s
+++ b/lld/test/MachO/arm64-thunks.s
@@ -8,14 +8,38 @@
## (4) early calls to a dylib stub use a thunk, and later calls the stub
## directly
## (5) Thunks are created for all sections in the text segment with branches.
+## (6) Thunks are in the linker map file.
## Notes:
## 0x4000000 = 64 Mi = half the magnitude of the forward-branch range
# RUN: rm -rf %t; mkdir %t
# RUN: llvm-mc -filetype=obj -triple=arm64-apple-darwin %s -o %t/input.o
-# RUN: %lld -arch arm64 -dead_strip -lSystem -U _extern_sym -o %t/thunk %t/input.o
+# RUN: %lld -arch arm64 -dead_strip -lSystem -U _extern_sym -map %t/thunk.map -o %t/thunk %t/input.o
# RUN: llvm-objdump --no-print-imm-hex -d --no-show-raw-insn %t/thunk | FileCheck %s
+## Check that the thunks appear in the map file and that everything is sorted by address
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _b
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _c
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _d.thunk.0
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _e.thunk.0
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _f.thunk.0
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _g.thunk.0
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _h.thunk.0
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] ___nan.thunk.0
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _d
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _e
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _f
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _g
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _a.thunk.0
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _b.thunk.0
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _h
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _main
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _c.thunk.0
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _d.thunk.1
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _e.thunk.1
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _f.thunk.1
+# MAP: [[0x[0-9A-Fa-f]+]] 0x[0-9A-Fa-f]+ \[[0-9]+\] _z
+
# CHECK: Disassembly of section __TEXT,__text:
# CHECK: [[#%.13x, A_PAGE:]][[#%.3x, A_OFFSET:]] <_a>:
|
lld/test/MachO/arm64-thunks.s
Outdated
| # Because of the `.space` instructions, there will end up being a lot of dead symbols in the linker map | ||
| # so generate a version of the linker map without dead symbols. | ||
| # RUN: awk '/# Dead Stripped Symbols:/ {exit} {print}' %t/thunk.map > %t/thunk_no_dead_syms.map |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the point of this awk command. The last symbol you check is _z and then presumably there will be dead symbols after that. But the MAP check lines are done, so there is no problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that because the .space. commands, linker map ends up being 2.7GB in size - with apparently each character added via .space being interpreted as a dead symbol (this might be another unrelated issue).
So the issue is that if the test were to break, then it would take a very very long time to complete (I didn't wait to fully time it) - trying to match the regex against the 2.7GB file.
This is just an optimization to have the test complete in a practical amount of time - even in case of failure.
carlocab
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not super familiar with this part of LLD, so would be good to get an approval from someone else. But looks reasonable to me as far as I can tell.
|
I'm seeing some failures on our i386 builds after this commit: Stack trace from the log below, but it's possible that it is running out of memory. |
|
With this commit we do add a new temporary array that can be rather large for big binaries. So running out of memory on x86 is a possibility. Does it happen consistently ? Is this just when running |
|
It's been failing consistently the last 3 days we've run the builds. The CMake options we use are in the log file. I'm not sure if our config is considered special or not. |
|
I gathered some stats on my x86_64 machine with Before 162814a: After 162814a: So a pretty significant increase in memory usage and run time as well. I would consider this kind of run time increase a regression on all targets. Is there anything that can be done to speed up the tests/reduce the memory usage? |
|
#122785 gets rid of the extra memory required by this PR. |
This commit improves the memory efficiency of the lld-macho linker by optimizing how thunks are printed in the map file. Previously, merging vectors of input sections required creating a temporary vector, which increased memory usage and in some cases caused the linker to run out of memory as reported in comments on #120496. The new approach interleaves the printing of two arrays of ConcatInputSection in sorted order without allocating additional memory for a merged array.
…file (#122785) This commit improves the memory efficiency of the lld-macho linker by optimizing how thunks are printed in the map file. Previously, merging vectors of input sections required creating a temporary vector, which increased memory usage and in some cases caused the linker to run out of memory as reported in comments on llvm/llvm-project#120496. The new approach interleaves the printing of two arrays of ConcatInputSection in sorted order without allocating additional memory for a merged array.
This patch extends the MachO linker's map file generation to include branch extension thunk symbols. Previously, thunks were omitted from the map file, making it difficult to understand the final layout of the binary, especially when debugging issues related to long branch thunks. This change ensures thunks are included and correctly interleaved with other symbols based on their address, providing an accurate representation of the linked output.